-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More strategies for NumPy #472
Conversation
Thanks for doing this! I haven't looked into it in extreme detail yet (but I will). Immediate feedback on the desired points:
Yes, looks correct to me.
So the numpy package already has problems with producing values and minimising (when the dimensions are large), which are mostly problems caused by some of the areas that Hypothesis is not currently very good at (basically generating large amounts of data). I've been thinking about ways to mitigate that - basically moving to something where the values it generates are sparse differences to some default value, but I haven't experimented in detail. You'd be welcome to either experiment with the same yourself or consider that out of scope for this as you prefer. One area where you might want to investigate explicitly what minimizing looks like is when the dimensions come from a strategy. That sort of length parameter + drawing a fixed amount of data shrinks a lot better than it used to in Hypothesis, but it's still not as good as something that is more directly adapted to Hypothesis's approach. It should be at worst ok though.
I need to think about this a bit more. My feeling is that given the relatively small number of options, the explicit sizes parameter might be the better choice of the two.
I need to think about this a bit more too. It might be useful to have some shrink quality tests that assert different things shrink to particular small values.
This is probably a compatibility issue. I'd recommend running the tests with pytest --pdb and seeing what the type being passed to struct that it doesn't understand is. My guess would be that it's an hbytes where it should be a str or something, but I'm not sure why that would be (getting byte behaviour consistent in Python 2/3 is awful). |
cdf08a1
to
8edb162
Compare
Thanks! I probably won't worry about efficiency so long as it's still usable, but writing some tests to confirm good shrinking behaviour sounds good. |
fd4ae01
to
a79a04e
Compare
Turns out it was a type error naming Numpy subdatatypes after all. I've added some more tests, including minimisation. And along with my typo, an unrelated flaky test. And here's another one. |
3fbd4c5
to
ae76154
Compare
8f9e4b4
to
6e380f6
Compare
I've gone back and squashed in changes to the tests which were otherwise flaky due to hitting the buffer overflow detector (yay it works), and linked to the travis runs that hit other flaky tests above. This has not been a great time for AWS to be down / Travis logs to be randomly irretrievable. I don't know why CircleCI thinks it's still running, if you click through the tests have passed! @DRMacIver - I'm done here. All the tests are passing, I've finally realised that |
Thanks for doing this! I’ve skimmed the patch and it looks good, but other work means I won’t have time for a detailed review for about a week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good in general. I started to add a bunch of comments in line, but they seem to mostly fall into two general patterns so I stopped commenting on every one.
- arguments shouldn't be validated with assert, but instead should raise InvalidArgumentException. You may want to define some helper functions for e.g. checking one of a set of options.
- There are a lot of places where you have boolean flags that I think should actually be separate strategies which people can then or together.
Other than those I think this looks pretty good. I'll need to have another pass through it afterwards, but I think those are the only big things.
Thanks again for doing this! I'm super excited to get the numpy support improved.
src/hypothesis/extra/numpy.py
Outdated
@@ -56,12 +72,19 @@ class ArrayStrategy(SearchStrategy): | |||
def __init__(self, element_strategy, shape, dtype): | |||
self.shape = tuple(shape) | |||
assert shape | |||
self.array_size = reduce(operator.mul, shape) | |||
self.array_size = np.prod(shape) | |||
if self.array_size * dtype.itemsize > settings().buffer_size: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can access this as settings.default.buffer_size rather than creating a new settings object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, what I really want to do is get the size of the current buffer - you're saying that settings.default.buffer_size
reflects any user settings too?
src/hypothesis/extra/numpy.py
Outdated
self.array_size = reduce(operator.mul, shape) | ||
self.array_size = np.prod(shape) | ||
if self.array_size * dtype.itemsize > settings().buffer_size: | ||
raise ValueError(( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InvalidArgumentError from hypothesis.errors is probably the right thing to use here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll switch out all the asserts, but is a value drawn from a strategy conceptually an invalid argument? (I lean towards yes, but worth checking)
src/hypothesis/extra/numpy.py
Outdated
'Insufficient bytes of entropy to draw requested array. ' | ||
'shape={}, dtype={}. Consider increasing settings().' | ||
'buffer_size to more than {} if slow test runs and very slow ' | ||
'minimization are acceptable.').format(shape, dtype, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly add "or reduce the dimensions of the array" as a suggestion?
src/hypothesis/extra/numpy.py
Outdated
assert endianness in ('?', '<', '=', '>') | ||
if isinstance(sizes, int): | ||
sizes = (sizes,) | ||
assert sizes and all(s in (8, 16, 32, 64) for s in sizes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assertions are best not used for argument validation. These should all be "raise InvalidArgumentError" if the condition failed.
src/hypothesis/extra/numpy.py
Outdated
include these values in the real_sizes argument. | ||
|
||
""" | ||
assert real_sizes and all(s in (16, 32, 64, 96, 128) for s in real_sizes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about assertions for argument validation.
src/hypothesis/extra/numpy.py
Outdated
|
||
@st.defines_strategy | ||
def timestamp_dtypes(max_period='Y', min_period='ns', | ||
allow_datetime=True, allow_timedelta=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like datetime and timedelta should probably be independent top level strategies rather than boolean flags to a single strategy. It's not obvious to me that there are many circumstances where having a value which could be either one is a reasonable thing to do, and mixing them together is easy enough to do in those cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. I'll split these up, along with the numeric type strategies. I think string types are confused often enough that it's useful to keep them in one strategy though.
src/hypothesis/extra/numpy.py
Outdated
|
||
|
||
@st.defines_strategy | ||
def scalar_dtypes(integers=True, floats=True, times=True, strings=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me that these boolean arguments are an improvement over people writing one\_of(integer_dtypes(), ..., )
themselves. I can see that it's useful to have an 'any scalar dtype' strategy, but do you have a particular use case in mind where it's more desirable to exclude one or two types than it is to explicitly include the types you want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No particular case, so back to a no-argument function. Composing your own allows for tweaking the individual types too, which actually does have usecases (eg 'anything 64bit').
tests/numpy/test_gen_data.py
Outdated
assert isinstance(strat, SearchStrategy) | ||
# And use it to fill an array of that dtype | ||
assume(10 * dtype.itemsize <= settings().buffer_size) | ||
arrays(dtype, 10, strat) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me what this arrays call is testing. Note that because of the way deferred strategies work it's not actually going to be running any of the array generation code. Did you maybe want to have an additional data() parameter to given and call data.draw on this so as to force a generation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would look something like this:
@given(nested_dtypes(), st.data())
def test_infer_strategy_from_dtype(dtype, data):
# Given a dtype
assert isinstance(dtype, np.dtype)
# We can infer a strategy
strat = from_dtype(dtype)
assert isinstance(strat, SearchStrategy)
# And use it to fill an array of that dtype
assume(10 * dtype.itemsize <= settings().buffer_size)
data.draw(arrays(dtype, 10, strat))
tests/numpy/test_gen_data.py
Outdated
|
||
|
||
def test_minimise_scalar_dtypes(): | ||
assert find(scalar_dtypes(), lambda x: True) == np.dtype(u'bool') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's the 'minimal' function in hypothesis.internal.debug that you probably want here.
src/hypothesis/extra/numpy.py
Outdated
|
||
|
||
@st.defines_strategy | ||
def array_dtypes(subtype_strategy=scalar_dtypes(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confess I don't really understand what's going on here, but that may be more unfamiliarity with numpy in general and this corner of it in particular. I'd find it useful if there were some tests explicitly for array_dtypes that helped elaborate on what it was for/how to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll improve the docs here, but basically this is NumPy's equivalent to a namedtuple, or struct if you're coming from the C-side. Eg you might have an array of dtype([('time', M8[ns]), ('lat', f8), ('lon', f8)])
representing a GPS track.
To be honest I don't think many people will use this strategy - generally you have a pretty specific idea of what the structure should be - but it's useful to test from_dtypes
and construct the nested case at least!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough.
@alexwlchan, thanks! I'm in no rush, and it's been a pleasure. Using Hypothesis to write tests for strategies is the most fun kind of testing I've done in a long time. @DRMacIver, thanks for the feedback. I'll address the details that I want to discuss inline, and otherwise make the relevant changes. I'll also |
src/hypothesis/extra/numpy.py
Outdated
"""Return a strategy for generating array (compound) dtypes, with members | ||
drawn from the given subtype strategy.""" | ||
assert 1 <= min_size <= max_size | ||
native_text = st.text if text_type is str else st.binary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedant point I just spotted: This is a native string, not a native text. The whole problem with the 2/3 split is that Python 2 strings aren't really text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not pedantry, it's correctness. Especially important in this context...
docs/numpy.rst
Outdated
|
||
|
||
.. automodule:: hypothesis.extra.numpy | ||
:members: arrays, array_shapes, scalar_dtypes, boolean_dtypes, unsigned_integer_dtypes, integer_dtypes, floating_dtypes, complex_number_dtypes, datetime64_dtypes, timedelta64_dtypes, string_dtypes, array_dtypes, nested_dtypes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is only here because arrays
isn't picked up by Sphinx due to the @st.composite
decorator. Any idea why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. That sounds like a bug in composite. My guess would be that some relevant underlying property is not being copied over. __module__ maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that was pretty much my assumption. I don't think it's worth my time to dive into the details of decorators in this case where it's so easy to work around - I may have a look later, but not as part of this pull.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. Not as part of this pull request is entirely the correct approach. I'll open an issue about this.
da81c64
to
834567d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good, thanks! Two comments in line.
src/hypothesis/extra/numpy.py
Outdated
@st.defines_strategy | ||
def array_shapes(min_dims=1, max_dims=3, min_side=1, max_side=10): | ||
"""Return a strategy for array shapes (tuples of int >= 1).""" | ||
check_argument(1 <= min_dims <= max_dims, u'Invalid dimensions') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, my preference for error messages is that they should always include the values that were invalid (and ideally an explanation of why they were invalid). It makes debugging things a lot less annoying when using an API wrong.
I feel bad making you adhere to my level of pet peeve on this one though, so I'll let you decide whether you care enough to change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, I actually agree - when I get an error message I want it to be good enough that I don't need any other information to fix the problem.
src/hypothesis/extra/numpy.py
Outdated
|
||
|
||
@defines_dtype_strategy | ||
def string_dtypes(byte_strs=True, unicode_strs=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do still think these should be separate. It's more consistent with the rest of the API, and I don't really want to encourage bytes/unicode confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
np.prod is reduce-by-product, and avoids some imports. Initialising an empty array is idiomatic as we're about to overwrite all the zeros, and though irrelevant in this case can be faster for large arrays. Test with 'minimal' instead of 'find' - it's the internal equivalent, with nice settings management and timeouts.
This is particularly useful in composite strategies, where generating pairs of arrays with arbitrary complementary shapes is useful for many tests.
Including a metastrategy for dtypes with a size and endianness.
These are unusual types, and often overlooked - which makes them perfect to throw at code that is meant to handle anything. Also support strategy inference for these types.
Both bytestrings and unicode strings, to force clear consideration of which are handled and how.
Compound (or 'array') dtypes are similar to namedtuples. Now we can generate them too, and infer strategies for such objects. Subarray dtypes (eg '(2,3)i4') are also supported, but not generated with default arguments. Closes HypothesisWorks#470
Because if anyone is using nested array dtypes - think 'nested structs' - in production, they need all the help we can offer.
This is among the most common uses for the other strategies, so let's support it out of the box.
Now lives on it's own page, like Django
The documentation build is currently failing, I think because you forgot to update it after one of the changes. Other than that, LGTM! |
Yep, forgot to change the members line when I split the string strategies - ironic given comments about leaving that in above. The next build is running now 😄, and once it's done you can merge. Any timeline for a release? |
Thanks again for doing this! RE timeline for release: To be honest we're overdue for a release. I'll try to make sure we get one out in the next week or so. |
I've been using Hypothesis to test code using numpy for a while, and accumulated a few utility functions to generate diverse dtypes and arrays. This pull is actually much nicer, since I've taken the time to prod at some unusual dtypes and write a more general API. I hope that this will encourage more software testing in the scientific Python community!